High-accuracy value-function approximation with neural networks applied to the acrobot

نویسنده

  • Rémi Coulom
چکیده

Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this paper, we present experimental results obtained by using a feedforward neural network instead. The learning algorithm used was model-based continuous TD(λ). It generated an efficient controller, producing a high-accuracy state-value function. A striking feature of this value function is a very sharp 4-dimensional ridge that is extremely hard to evaluate with linear parametric approximators. From a broader point of view, this experimental success demonstrates some of the qualities of feedforward neural networks in comparison with linear approximators in reinforcement learning.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

STRUCTURAL DAMAGE DETECTION BY MODEL UPDATING METHOD BASED ON CASCADE FEED-FORWARD NEURAL NETWORK AS AN EFFICIENT APPROXIMATION MECHANISM

Vibration based techniques of structural damage detection using model updating method, are computationally expensive for large-scale structures. In this study, after locating precisely the eventual damage of a structure using modal strain energy based index (MSEBI), To efficiently reduce the computational cost of model updating during the optimization process of damage severity detection, the M...

متن کامل

ارائه مدلی جهت پیش بینی بیماری دیابت با استفاده از شبکه عصبی

Introduction: Meta-heuristic and combined algorithms have a great capability in modelling medical decision making. This study used neural networks in order to predict Type 2 Diabetes (T2D) among high risk individuals. Methods: This study was   an applied research. Data from 545 individuals (diabetic and non-diabetic), in Diabetes Clinic of Hamedan University of Medical Sciences, we...

متن کامل

Comparison of the performances of neural networks specification, the Translog and the Fourier flexible forms when different production technologies are used

This paper investigates the performances of artificial neural networks approximation, the Translog and the Fourier flexible functional forms for the cost function, when different production technologies are used. Using simulated data bases, the author provides a comparison in terms of capability to reproduce input demands and in terms of the corresponding input elasticities of substitution esti...

متن کامل

Neural Network Performance Analysis for Real Time Hand Gesture Tracking Based on Hu Moment and Hybrid Features

This paper presents a comparison study between the multilayer perceptron (MLP) and radial basis function (RBF) neural networks with supervised learning and back propagation algorithm to track hand gestures. Both networks have two output classes which are hand and face. Skin is detected by a regional based algorithm in the image, and then networks are applied on video sequences frame by frame in...

متن کامل

Implementation of a programmable neuron in CNTFET technology for low-power neural networks

Circuit-level implementation of a novel neuron has been discussed in this article. A low-power Activation Function (AF) circuit is introduced in this paper, which is then combined with a highly linear synapse circuit to form the neuron architecture. Designed in Carbon Nanotube Field-Effect Transistor (CNTFET) technology, the proposed structure consumes low power, which makes it suitable for the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004